Objective

This lab aims to guide you through downloading American Community Survey (ACS) data, specifically focusing on income at the census tract level. By the end of this lab, you will be able to:

Access and download ACS data for one or multiple states.

Process and clean the data.

Perform basic analysis on income data.

Before starting, make sure you have the following packages installed.

# install.packages("tidyverse")

# install.packages("tidycensus")

# install.packages("sf")

# install.packages("tigris")

# install.packages("mapview")

Steps

Step 1: Load Required Libraries

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidycensus)
library(sf)
## Linking to GEOS 3.10.2, GDAL 3.4.2, PROJ 8.2.1; sf_use_s2() is TRUE
library(tigris)
## To enable caching of data, set `options(tigris_use_cache = TRUE)`
## in your R script or .Rprofile.
library(mapview)
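The tigris startup message above points to a caching option. A minimal sketch, assuming you want downloaded tract shapefiles to be reused between sessions instead of being fetched again each time:

```r
# Ask tigris (used by tidycensus when geometry = TRUE) to cache shapefiles.
# The option name is taken from the tigris startup message shown above.
options(tigris_use_cache = TRUE)

# Confirm the option is set for the current session
cache_enabled <- isTRUE(getOption("tigris_use_cache"))
print(cache_enabled)
```

Placing the options() call near the top of your script, right after the library() calls, keeps the setting with the rest of the setup.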

Step 2: Set Up Census API Key

You need a Census API key to access ACS data. If you don’t have one, you can request it at https://api.census.gov/data/key_signup.html. Once you have your key, set it up in R.

# Replace YOUR_CENSUS_API_KEY with your own key
# census_api_key("YOUR_CENSUS_API_KEY", install = TRUE)
census_api_key("YOUR_CENSUS_API_KEY", overwrite = TRUE)
## To install your API key for use in future sessions, run this function with `install = TRUE`.
readRenviron("~/.Renviron")
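To keep a personal key out of a script that you share or submit, one option is to check for it in an environment variable before downloading anything. The sketch below assumes the key is stored under the name CENSUS_API_KEY, which is the variable tidycensus writes to .Renviron when census_api_key() is run with install = TRUE. The helper name has_census_key is hypothetical.

```r
# Hypothetical helper: TRUE when a non-empty key is available in the environment
has_census_key <- function() {
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

if (has_census_key()) {
  message("Census API key found; tidycensus can use it.")
} else {
  message("No key found. Run census_api_key() with install = TRUE once, then restart R.")
}
```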

Step 3: Download ACS Data

Use the tidycensus package to download ACS data. We’ll focus on median household income (ACS variable B19013_001) at the census tract level for Virginia.

# Define variables for median household income
variables <- c(income = "B19013_001")
# Download data for Virginia
acs_data <- get_acs(
  geography = "tract",
  variables = variables,
  state = "VA",
  year = 2020,
  survey = "acs5",
  geometry = TRUE
)
## Getting data from the 2016-2020 5-year ACS
## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
# View the first few rows of the data
head(acs_data)
## Simple feature collection with 6 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -78.78807 ymin: 37.51485 xmax: -77.18975 ymax: 38.89167
## Geodetic CRS:  NAD83
##         GEOID                                           NAME variable estimate
## 1 51087200409 Census Tract 2004.09, Henrico County, Virginia   income    54940
## 2 51760021000      Census Tract 210, Richmond city, Virginia   income       NA
## 3 51003010100   Census Tract 101, Albemarle County, Virginia   income    87000
## 4 51059471401 Census Tract 4714.01, Fairfax County, Virginia   income   116683
## 5 51059432500    Census Tract 4325, Fairfax County, Virginia   income   157426
## 6 51059492300    Census Tract 4923, Fairfax County, Virginia   income   112305
##     moe                       geometry
## 1 16759 MULTIPOLYGON (((-77.53545 3...
## 2    NA MULTIPOLYGON (((-77.40362 3...
## 3 24390 MULTIPOLYGON (((-78.78807 3...
## 4 21233 MULTIPOLYGON (((-77.20789 3...
## 5 14290 MULTIPOLYGON (((-77.26601 3...
## 6 23718 MULTIPOLYGON (((-77.25242 3...
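The GEOID column in the output above encodes the geographic hierarchy. For census tracts it is an 11-character string made of a 2-digit state FIPS code, a 3-digit county FIPS code, and a 6-digit tract code. A small sketch using GEOIDs copied from the output (the column names in the result are chosen here for illustration):

```r
# GEOIDs copied from the head(acs_data) output
geoid <- c("51087200409", "51760021000", "51003010100")

# Split each tract GEOID into state (2), county (3), and tract (6) parts
parts <- data.frame(
  GEOID  = geoid,
  state  = substr(geoid, 1, 2),
  county = substr(geoid, 3, 5),
  tract  = substr(geoid, 6, 11),
  stringsAsFactors = FALSE
)
print(parts)
```

Here 51 is the state code for Virginia, and the tract part 200409 of the first GEOID corresponds to Census Tract 2004.09 in the NAME column.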

Step 4: Rename columns for clarity

colnames(acs_data) <- c("GEOID", "NAME", "variable","Median_Income", "moe","geometry")
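The moe column is the margin of error that the Census Bureau publishes with each ACS estimate, at the 90 percent confidence level. A rough sketch of how to turn it into a confidence interval and an approximate standard error, using the first row of the output from Step 3; the value 1.645 is the z-score for a 90 percent interval, and the variable names are illustrative:

```r
estimate <- 54940   # median income for Census Tract 2004.09, Henrico County
moe      <- 16759   # published 90 percent margin of error

# 90 percent confidence interval: estimate plus or minus the margin of error
ci_90 <- c(lower = estimate - moe, upper = estimate + moe)

# Approximate standard error implied by the margin of error
se <- moe / 1.645

# Coefficient of variation in percent, a common reliability check
cv <- 100 * se / estimate

print(ci_90)
print(round(cv, 1))
```

A wide interval relative to the estimate is a reminder that tract-level figures come from small samples and should be compared with care.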

Step 5: Perform Basic Analysis

Perform basic analysis to explore income levels.

# Summary statistics for median income
summary(acs_data$Median_Income)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    2499   52132   72256   84248  106560  250001      46
# Plot median income distribution
ggplot(acs_data, aes(x = Median_Income)) +
  geom_histogram(binwidth = 5000, fill = "blue", color = "black") +
  labs(title = "Distribution of Median Household Income in Virginia",
       x = "Median Household Income",
       y = "Frequency")
## Warning: Removed 46 rows containing non-finite outside the scale range
## (`stat_bin()`).
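The minimum (2499) and maximum (250001) in the summary are most likely not real medians but the codes the Census Bureau uses for medians at or below $2,500 and at or above $250,000. A minimal sketch for counting such tracts, using a toy vector in place of acs_data$Median_Income:

```r
# Toy vector standing in for acs_data$Median_Income
median_income <- c(2499, 54940, NA, 87000, 250001)

# Flag values that sit at the reporting limits seen in the summary above
bottom_coded <- !is.na(median_income) & median_income <= 2499
top_coded    <- !is.na(median_income) & median_income >= 250001

# Count tracts in each group, plus tracts with no estimate
counts <- c(
  bottom  = sum(bottom_coded),
  top     = sum(top_coded),
  missing = sum(is.na(median_income))
)
print(counts)
```

Knowing how many tracts fall at the limits helps when interpreting the tails of the histogram and the mean.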

Step 6: Map Visualization

Visualize the data on a map.

# Plot median income on a map
ggplot(acs_data) +
  geom_sf(aes(fill = Median_Income)) +
  scale_fill_viridis_c() +
  labs(title = "Median Household Income by Census Tract in Virginia",
       fill = "Median Income")

Step 7: Download ACS Data for Multiple States

acs_data <- get_acs(
  geography = "tract",
  variables = "B19013_001",
  state = c("DC", "MD", "VA"),
  year = 2020,
  survey = "acs5",
  geometry = TRUE
)
## Getting data from the 2016-2020 5-year ACS
## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
## Fetching tract data by state and combining the result.
colnames(acs_data) <- c("GEOID", "NAME", "variable","Median_Income", "moe","geometry")

Step 8: Use mapview to Visualize the Results

mapview(acs_data, zcol = "Median_Income")

Use the Gini index to quantify inequality in the DMV area

Now that we have income data for DC, VA, and MD, let’s focus on the DMV (District of Columbia, Maryland, and Virginia) area for some basic analysis. Specifically, we will quantify inequality using the Gini index. The Gini index is a measure of statistical dispersion intended to represent the income inequality within a region. A Gini index of 0 represents perfect equality, while an index of 1 indicates maximal inequality.
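To make the 0-to-1 scale concrete, here is a minimal sketch of one common sample formula for the Gini index, written in base R. The helper name gini_manual is hypothetical, and the two income vectors are made-up illustrations rather than ACS data.

```r
# Hypothetical helper: Gini index from sorted values, ignoring NAs
gini_manual <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with x sorted ascending
  2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n
}

equal_incomes   <- c(50000, 50000, 50000, 50000)  # everyone has the same income
unequal_incomes <- c(0, 0, 0, 200000)             # one unit holds all the income

print(gini_manual(equal_incomes))
print(gini_manual(unequal_incomes))
```

With this formula the largest possible value for n observations is (n - 1) / n, so it approaches 1 only as n grows. Note also that applying the index to tract medians describes inequality between tracts, not between individual households.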

Step 1: Import the DMV Shapefile

dmv<-st_read('dmv.shp')
## Reading layer `dmv' from data source 
##   `/Users/yshao/work/Geog4254-5254G/lab4/dmv.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 7 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -78.45 ymin: 37.99 xmax: -76.38 ymax: 39.72
## Geodetic CRS:  WGS 84
plot(dmv)

Step 2: Select Tracts by Location (DMV Boundary)

# Transform the CRS of the ACS data to match the DMV shapefile
acs_data <- st_transform(acs_data, st_crs(dmv))

# Perform the spatial subset
dmv_data <- acs_data[dmv, ,op = st_intersects]
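The op argument controls which tracts are kept. With st_intersects, any tract that touches or overlaps the boundary is selected, including tracts that extend beyond it. A small sketch with toy squares (all geometries, ids, and the helper square are made up for illustration) compares this with st_within, which keeps only tracts that lie entirely inside:

```r
library(sf)

# Illustrative helper: an axis-aligned square polygon
square <- function(xmin, ymin, size = 1) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmin + size, ymin), c(xmin + size, ymin + size),
    c(xmin, ymin + size), c(xmin, ymin)
  )))
}

# Three toy "tracts" and one toy boundary covering x and y from 0 to 4
tracts <- st_sf(
  id = c("inside", "straddling", "outside"),
  geometry = st_sfc(square(1, 1), square(3.5, 1), square(6, 1), crs = 3857)
)
boundary <- st_sf(geometry = st_sfc(square(0, 0, size = 4), crs = 3857))

# Same subsetting syntax as in the lab, with two different predicates
hits   <- tracts[boundary, , op = st_intersects]
within <- tracts[boundary, , op = st_within]

print(hits$id)
print(within$id)
```

Which predicate is appropriate depends on whether tracts that cross the study-area boundary should count as part of the region.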

Step 3: Use mapview to Visualize the Results

mapview(dmv_data, zcol = "Median_Income")

Step 4: Calculate the Gini Index for the Entire DMV Area

Install the ineq package before you run the following code.

install.packages("ineq")

library(ineq)
gini_index <- Gini(dmv_data$Median_Income)

# Print the Gini index
print(gini_index)
## [1] 0.2261062

Step 5: Write Data to a Shapefile

Save the processed data as a shapefile.

st_write(dmv_data, "acs_data_dmv.shp", delete_layer = TRUE)
## Warning in abbreviate_shapefile_names(obj): Field names abbreviated for ESRI
## Shapefile driver
## Writing layer `acs_data_dmv' to data source 
##   `acs_data_dmv.shp' using driver `ESRI Shapefile'
## Writing 1546 features with 5 fields and geometry type Multi Polygon.
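The warning above appears because the shapefile format limits attribute names to 10 characters, so longer names are abbreviated automatically. A minimal sketch for spotting the affected columns and choosing short names yourself; the replacement name med_inc is only an illustration:

```r
# Attribute names in dmv_data (geometry column excluded)
field_names <- c("GEOID", "NAME", "variable", "Median_Income", "moe")

# Names longer than 10 characters will be abbreviated by the shapefile driver
too_long <- field_names[nchar(field_names) > 10]
print(too_long)

# One way to pick a short name in advance (illustrative choice)
short_names <- sub("Median_Income", "med_inc", field_names)
print(short_names)
```

If the full column names matter, writing to a GeoPackage instead (a file name ending in .gpkg in st_write) avoids the 10-character limit.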

Lab questions:

  1. Identify and Report the Tract with the Highest Median Income in DMV

  2. Identify and Report the Tract with the Lowest Median Income in DMV

  3. Calculate the Gini index for the Richmond, VA Metro Area. The spatial boundary of the Richmond Metro Area [Richmond_metro.shp] is included in the Lab 4 folder. You will need to revise the script, particularly the section on selecting by location, as well as parts of the lab steps to obtain the answer.

  4. Create a map visualizing the median household income for each tract within the Richmond VA Metro area using the tmap package. Attach the resulting map.